AITopics | semi-bandit feedback

semi-bandit feedback

47561f5e1dc53c7d119185e217b523d0-Paper-Conference.pdf

Neural Information Processing Systems, Feb-8-2026, 16:57:07 GMT

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry:

Information Technology (0.46)
Energy (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.70)
Information Technology > Data Science (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)

Stochastic Online Greedy Learning with Semi-bandit Feedbacks

Neural Information Processing Systems, Dec-27-2025, 15:04:20 GMT

The greedy algorithm has been studied extensively in combinatorial optimization for decades. In this paper, we address the online learning problem in which the input to the greedy algorithm is stochastic, with unknown parameters that must be learned over time. We first propose the greedy regret and $\epsilon$-quasi greedy regret as learning metrics that compare against the performance of the offline greedy algorithm. We then propose two online greedy learning algorithms with semi-bandit feedback, one for each regret metric, which use multi-armed bandit and pure exploration bandit policies, respectively, at each level of greedy learning. Both algorithms achieve an $O(\log T)$ problem-dependent regret bound ($T$ being the time horizon) for a general class of combinatorial structures and reward functions that allow greedy solutions. We further show that the bound is tight in $T$ and other problem instance parameters.
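
To make the idea of running a bandit policy at each level of a greedy procedure concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the algorithm from the paper: the function names, the UCB1-style index with constant 1.5, and the toy Bernoulli environment (`MEANS`, `bernoulli_marginal`) are all hypothetical. At every round the learner builds a solution level by level, picks the next item by an upper confidence index, and observes the marginal reward of each level (the semi-bandit feedback).

```python
import math
import random
from collections import defaultdict

def ucb_index(mean, pulls, t):
    """UCB1-style index; unobserved arms get +inf so each is tried once."""
    if pulls == 0:
        return math.inf
    return mean + math.sqrt(1.5 * math.log(t) / pulls)

def online_greedy_ucb(items, k, marginal_reward, horizon, seed=0):
    """Build a size-k solution greedily each round, choosing by UCB per level.

    Statistics are kept per (prefix, item) pair, because the value of an
    item at a given level may depend on what was selected before it.
    """
    rng = random.Random(seed)
    pulls = defaultdict(int)    # (prefix, item) -> number of observations
    means = defaultdict(float)  # (prefix, item) -> empirical mean reward
    last = ()
    for t in range(1, horizon + 1):
        prefix = ()
        for _ in range(k):
            candidates = [a for a in items if a not in prefix]
            arm = max(
                candidates,
                key=lambda a: ucb_index(means[(prefix, a)], pulls[(prefix, a)], t),
            )
            # Semi-bandit feedback: the marginal reward of this level is observed.
            r = marginal_reward(prefix, arm, rng)
            key = (prefix, arm)
            pulls[key] += 1
            means[key] += (r - means[key]) / pulls[key]  # incremental mean
            prefix = prefix + (arm,)
        last = prefix
    return last, pulls

# Hypothetical toy environment: Bernoulli marginal rewards that ignore the prefix.
MEANS = {0: 0.9, 1: 0.5, 2: 0.1}

def bernoulli_marginal(prefix, arm, rng):
    return 1.0 if rng.random() < MEANS[arm] else 0.0

if __name__ == "__main__":
    final, counts = online_greedy_ucb([0, 1, 2], 2, bernoulli_marginal, 2000)
    print("last greedy solution:", final)
    print("level-0 pull counts:", {a: counts[((), a)] for a in MEANS})
```

Keying the statistics on the prefix keeps the sketch faithful to a greedy structure in which each level is its own bandit problem, at the cost of more arms to explore than a flat per-item table would need.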

name change, semi-bandit feedback, stochastic online greedy learning, (2 more...)

Industry: Education (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback

Zheng Wen, Branislav Kveton, Michal Valko, Sharan Vaswani

Neural Information Processing Systems, Nov-21-2025, 10:57:17 GMT

We study the online influence maximization problem in social networks under the independent cascade model. Specifically, we aim to learn the set of "best

data mining, machine learning, node, (20 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
North America > Canada > British Columbia (0.04)
(2 more...)

Industry: Information Technology > Services (0.36)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.47)

Learning in Congestion Games with Bandit Feedback

Neural Information Processing Systems, Nov-14-2025, 03:59:25 GMT

To help address these issues, a natural approach is to consider games with special structures. In this paper, we focus on congestion games.

algorithm, bandit feedback, congestion game, (12 more...)

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.45)

Industry:

Information Technology (0.46)
Energy (0.45)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Censored Semi-Bandits: A Framework for Resource Allocation with Censored Feedback

Arun Verma, Manjesh Hanawal, Arun Rajkumar, Raman Sankaran

Neural Information Processing Systems, Oct-2-2025, 14:33:21 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, data mining, machine learning, (19 more...)

Country: Asia > India (0.29)

Industry: Law > Civil Rights & Constitutional Law (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.30)

Combinatorial Bandits Revisited

Richard Combes, Mohammad Sadegh Talebi Mazraeh Shahi, Alexandre Proutiere, Marc Lelarge

Neural Information Processing Systems, Oct-2-2025, 00:18:24 GMT

This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret. ESCB has better performance guarantees than existing algorithms, and significantly outperforms these algorithms in practice.
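
For readers unfamiliar with the stochastic semi-bandit setting, the following Python sketch shows its basic loop on a toy problem: each round the learner selects `m` of `n` base arms and observes the individual reward of every selected arm. This is not ESCB; it is a plain UCB index policy with a top-`m` selection step, and the function name, the confidence constant 1.5, and the Bernoulli reward model are assumptions made for illustration.

```python
import math
import random

def ucb_top_m(true_means, m, horizon, seed=0):
    """Select the m arms with the largest UCB index each round (semi-bandit)."""
    rng = random.Random(seed)
    n = len(true_means)
    pulls = [0] * n    # number of times each base arm was observed
    means = [0.0] * n  # empirical mean reward of each base arm
    for t in range(1, horizon + 1):
        index = [
            math.inf if pulls[i] == 0
            else means[i] + math.sqrt(1.5 * math.log(t) / pulls[i])
            for i in range(n)
        ]
        # Combinatorial action: the m arms with the highest indices.
        chosen = sorted(range(n), key=lambda i: index[i], reverse=True)[:m]
        for i in chosen:
            # Semi-bandit feedback: one Bernoulli reward per selected arm.
            x = 1.0 if rng.random() < true_means[i] else 0.0
            pulls[i] += 1
            means[i] += (x - means[i]) / pulls[i]  # incremental mean
    return pulls, means

if __name__ == "__main__":
    pulls, means = ucb_top_m([0.9, 0.8, 0.3, 0.2, 0.1], m=2, horizon=3000)
    print("pull counts:", pulls)
    print("empirical means:", [round(v, 3) for v in means])
```

Because every selected arm reveals its own reward, the estimates of all `m` arms improve in a single round, which is what separates semi-bandit feedback from full bandit feedback, where only the total reward would be seen.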

algorithm, complexity, omb exp, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Data Science > Data Mining > Big Data (0.90)
Information Technology > Artificial Intelligence > Machine Learning (0.69)

Finding Optimal Arms in Non-stochastic Combinatorial Bandits with Semi-bandit Feedback and Finite Budget

Neural Information Processing Systems, Aug-16-2025, 12:40:52 GMT

After each decision to choose a particular arm, the learner receives some form of feedback - typically a numerical reward - determined by a feedback mechanism of the chosen arm.

bandit, data mining, machine learning, (20 more...)

Country: